Fix OFED clash with OHPC #465

Draft: sjpb wants to merge 23 commits into main from fix/ofed-ohpc

sjpb commented Nov 1, 2024

  • Simplifies Packer image build variables
  • By default, removes logs from image build
  • Bumps OFED to version 24.07-0.6.1.0 for both RL8 and RL9
  • Adds tests

TODO: check fat image build doesn't downgrade OFED packages
TODO: check lustre install doesn't remove verbs.h
TODO: after merge, move custom base images to -latest
TODO: update image build docs

sjpb commented Nov 1, 2024

Base build: https://github.com/stackhpc/ansible-slurm-appliance/actions/runs/11627423791

RL9 build failed because of the Trivy scan rate limit
RL8 build succeeded, then was cancelled before the rate limit could cause a problem
RL9 image built manually

sjpb commented Nov 1, 2024

sjpb commented Nov 5, 2024

NB: the fatimage build above is wrong, as it won't be built on the -ofed24 image!

sjpb force-pushed the fix/ofed-ohpc branch 4 times, most recently from 3bd1c25 to b7388f5 (November 5, 2024 17:32)
sjpb force-pushed the fix/ofed-ohpc branch 2 times, most recently from 2f4601e to 0a05381 (November 5, 2024 21:29)
sjpb force-pushed the fix/ofed-ohpc branch 3 times, most recently from 42c03ac to 3bdac83 (November 6, 2024 09:03)

sjpb commented Nov 6, 2024

sjpb commented Nov 6, 2024

sjpb commented Nov 6, 2024

Nightly build above is failing: the CUDA build with a package clash, the non-CUDA build during shutdown??

Try with logs not being removed: https://github.com/stackhpc/ansible-slurm-appliance/actions/runs/11706321077

sjpb commented Nov 8, 2024
